Golang Job: Site Reliability Engineer - Client Support Service

Location

Rio de Janeiro - Brazil

Job type

Full-Time

Golang Job Details

As part of Kraken's Client Support Services (CSS) SRE Team, you will work within a world-class team of engineers building Kraken's infrastructure. As a Site Reliability Engineer, you will keep one of the fastest-growing companies in the world up and available in a 24/7 environment. You will bring your own technical expertise to monitor and support staging and production environments; build tooling, CI/CD pipelines, and deployment specs; and generally automate internal processes to empower developers and improve team efficiency.

Responsibilities

  • Monitor and support staging and production environments
  • Improve developer tooling, help build Docker images, and manage our Continuous Integration (CI) pipelines for automated quality testing
  • Manage releases using Kubernetes and Nomad
  • Implement tooling to keep track of key metrics and generate alerts
  • Collaborate with Dev, QA, and Product teams to support and improve the development and release cycle
  • Develop tools and bots to improve and automate internal processes
  • Support a fully distributed team operating across numerous timezones

Requirements

  • 3+ years of experience in an SRE or DevOps role, or equivalent experience as a Backend Developer working with infrastructure
  • 1+ years of experience with a programming language or runtime (NodeJS, Rust, Golang, or Python)
  • Extensive experience with monitoring tools such as Grafana, Prometheus, Splunk, and ELK
  • Thorough knowledge of Docker and orchestration tools such as Kubernetes or Nomad
  • Ability to configure and maintain different types of proxy services such as Nginx and HAProxy
  • Proficient with Git version control
  • Passion for improving process and products
  • Experience configuring Continuous Integration (CI)
  • Ability to thrive while working independently and remotely in a team-based environment
  • Self-starter with the ability to context-switch between various projects, codebases, and concepts
  • Ability to independently debug problems involving the network and operating system
  • Well-versed in scripting languages and in building and administering Linux systems
  • Interest in security and a thoughtful and thorough consideration of the security implications of development decisions

Bonus Points

  • Passion for open-source and contributing back to the community
  • Knowledge of Cloudflare caching, Page Rules, and Workers
  • Experience with HashiCorp Vault and its PKI features
  • Experience with local Kubernetes development tools such as Tilt
  • Experience with ReactJS and/or NextJS frameworks
  • Experience with Cloud infrastructure
  • Experience benchmarking applications and identifying bottlenecks
  • Experience with Slack, Jira, Google, and/or GitLab APIs
  • Experience with monitoring / alerting (primarily with Prometheus / Grafana) and knowledge of best practices in the area
  • Experience with distributed systems and technologies (gRPC, Kafka, NoSQL, SQL, Redis, ...)

Job Types: Full-time, Contract